After a bare metal Edge is upgraded to NSX 3.x or later, bond0 and all eth ports on the Edge appear as down.
root@Baremetaledge:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether ##:##:##:##:fb:0c brd ff:ff:ff:ff:ff:ff
altname enp1s0f0
altname eno3
4: eth3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether #:##:##:#:fb:0d brd ff:ff:ff:ff:ff:ff
altname enp1s0f1
altname eno4
6: eth12: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether ##:##:##:##:2e:f0 brd ff:ff:ff:ff:ff:ff
altname enp94s0f0
16: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether ##:##:##:##:a9:61 brd ff:ff:ff:ff:ff:ff
inet x.x.x.x/24 brd x.x.x.x scope global bond0 ----> (We do not see the associated eth links here)
valid_lft forever preferred_lft forever
inet6 fe80::####:####:####:####/64 scope link
valid_lft forever preferred_lft forever
The dmesg log contains the following error:
[ 7.660050] bond0: option slaves: interface +eth4 does not exist!
[ 7.688785] random: crng init done
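The check for this dmesg error can be scripted. The following is a minimal sketch, assuming the dmesg output has already been captured as text; the helper name `missing_bond_slaves` and the regular expression are illustrative, are modelled only on the single error line shown above, and are not part of any NSX tooling.

```python
import re

# Pattern modelled on the dmesg line shown above (assumption: other
# kernel versions may word this message differently).
MISSING_SLAVE_RE = re.compile(
    r"(?P<bond>\S+): option slaves: interface \+?(?P<iface>\S+) does not exist"
)

def missing_bond_slaves(dmesg_text):
    """Return (bond, interface) pairs for slaves the kernel could not find."""
    found = []
    for line in dmesg_text.splitlines():
        m = MISSING_SLAVE_RE.search(line)
        if m:
            found.append((m.group("bond"), m.group("iface")))
    return found

sample_dmesg = """\
[ 7.660050] bond0: option slaves: interface +eth4 does not exist!
[ 7.688785] random: crng init done
"""
missing = missing_bond_slaves(sample_dmesg)
print(missing)
```

A non-empty result means that at least one interface name configured for the bond was not present when the bond came up.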
Environment:
VMware NSX-T Data Center
VMware NSX
This issue is caused by a mismatch in the MAC-to-sysname mappings of the eth interfaces.
This is a known issue that is fixed in VMware NSX 4.2 and later. Release notes: https://techdocs.broadcom.com/us/en/vmware-cis/nsx/vmware-nsx/4-2/release-notes/vmware-nsx-420-release-notes.html
Workaround:
In the "Interface match" log lines, the sysname field is the name of the interface with the corresponding MAC address at the time the script ran (the state before the upgrade).
Check the syslog to determine which sysname corresponded to each eth interface before the bare metal Edge was upgraded.
Interface match: ##:##:##:##:fe:a2, sysname(eth8), rulename(eth0) ----> Before the upgrade
Interface match: ##:##:##:##:fe:a3, sysname(eth9), rulename(eth1)
Interface match: ##:##:##:##:0a:7e, sysname(eth10), rulename(eth10)
Interface match: ##:##:##:##:0a:7f, sysname(eth11), rulename(eth11)
Interface match: ##:##:##:##:0a:80, sysname(eth12), rulename(eth12)
Interface match: ##:##:##:##:0a:81, sysname(eth13), rulename(eth13)
Interface match: ##:##:##:##:d0:60, sysname(eth0), rulename(eth2)
Interface match: ##:##:##:##:d0:61, sysname(eth1), rulename(eth3)
Interface match: ##:##:##:##:1d:7c, sysname(eth2), rulename(eth4)
Interface match: ##:##:##:##:1d:7d, sysname(eth3), rulename(eth5)
Interface match: ##:##:##:##:fe:1e, sysname(eth4), rulename(eth6)
Interface match: ##:##:##:##:fe:1f, sysname(eth5), rulename(eth7)
Interface match: ##:##:##:##:58:16, sysname(eth6), rulename(eth8)
Interface match: ##:##:##:##:58:17, sysname(eth7), rulename(eth9)
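The "Interface match" lines above can be turned into a MAC-to-sysname lookup table. The following is a minimal sketch, assuming the relevant syslog lines have been copied into a string; the function name `parse_interface_matches` is hypothetical and the masked MAC addresses are placeholders taken from this article.

```python
import re

# Regex modelled on the "Interface match" lines shown above.
MATCH_RE = re.compile(
    r"Interface match:\s*(?P<mac>[^,\s]+),\s*"
    r"sysname\((?P<sysname>[^)]+)\),\s*rulename\((?P<rulename>[^)]+)\)"
)

def parse_interface_matches(text):
    """Return {mac: (sysname, rulename)} for every 'Interface match' line."""
    result = {}
    for line in text.splitlines():
        m = MATCH_RE.search(line)
        if m:
            mac = m.group("mac").lower()
            result[mac] = (m.group("sysname"), m.group("rulename"))
    return result

sample_log = """\
Interface match: ##:##:##:##:fe:a2, sysname(eth8), rulename(eth0)
Interface match: ##:##:##:##:0a:7e, sysname(eth10), rulename(eth10)
"""
matches = parse_interface_matches(sample_log)

# Interfaces whose pre-upgrade sysname differs from the rule name.
renamed = {mac: names for mac, names in matches.items() if names[0] != names[1]}
print(renamed)
```

Only the entries in `renamed` need attention; interfaces such as eth10, where sysname and rulename already agree, are unaffected.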
After the upgrade, the sysname and MAC address pairings in the rules file /etc/udev/rules.d/70-nsx-persistent-net.rules no longer match the pre-upgrade state, as shown below.
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="##:##:##:##:fe:a2", NAME="eth0" ---> Mismatch after the upgrade
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="##:##:##:##:fe:a3", NAME="eth1"
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="##:##:##:##:0a:7e", NAME="eth10"
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="##:##:##:##:0a:7f", NAME="eth11"
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="##:##:##:##:0a:80", NAME="eth12"
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="##:##:##:##:0a:81", NAME="eth13"
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="##:##:##:##:d0:60", NAME="eth2"
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="##:##:##:##:d0:61", NAME="eth3"
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="##:##:##:##:1d:7c", NAME="eth4"
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="##:##:##:##:1d:7d", NAME="eth5"
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="##:##:##:##:fe:1e", NAME="eth6"
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="##:##:##:##:fe:1f", NAME="eth7"
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="##:##:##:##:58:16", NAME="eth8"
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="##:##:##:##:58:17", NAME="eth9"
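Comparing the rules file against the pre-upgrade sysnames by hand is error-prone when there are many interfaces. The following is a minimal sketch of that comparison, assuming the rules text and a MAC-to-sysname dictionary (taken from the syslog) are already available; `parse_rules` and `find_mismatches` are hypothetical helper names.

```python
import re

# Matches the MAC address and NAME value of a udev rule line as shown above.
RULE_RE = re.compile(
    r'ATTR\{address\}=="(?P<mac>[^"]+)".*?NAME="(?P<name>[^"]+)"'
)

def parse_rules(text):
    """Return {mac: name} for every non-comment rule line."""
    rules = {}
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip comment lines
        m = RULE_RE.search(line)
        if m:
            rules[m.group("mac").lower()] = m.group("name")
    return rules

def find_mismatches(rules, sysnames):
    """Return {mac: (name_in_rules, pre_upgrade_sysname)} where they differ."""
    return {
        mac: (name, sysnames[mac])
        for mac, name in rules.items()
        if mac in sysnames and name != sysnames[mac]
    }

sample_rules = (
    'SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", '
    'ATTR{address}=="##:##:##:##:fe:a2", NAME="eth0"\n'
    'SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", '
    'ATTR{address}=="##:##:##:##:0a:7e", NAME="eth10"\n'
)
# Pre-upgrade sysnames, as read from the syslog (placeholder MACs).
sysnames = {"##:##:##:##:fe:a2": "eth8", "##:##:##:##:0a:7e": "eth10"}

mismatches = find_mismatches(parse_rules(sample_rules), sysnames)
print(mismatches)
```

Each entry in `mismatches` identifies a rule whose NAME value differs from the sysname that the same MAC address had before the upgrade.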
To fix the mismatch, edit the rules file /etc/udev/rules.d/70-nsx-persistent-net.rules so that each MAC address is mapped to the sysname it had before the upgrade.
Make a backup copy of the rules file before correcting the sysname and MAC address entries.
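The backup-then-edit step can be sketched as follows. This is an illustration only, assuming a MAC-to-sysname dictionary built from the syslog. The helpers `corrected_line` and `fix_rules_file` and the `.bak` suffix are hypothetical, and the demo runs against a temporary file rather than the real rules file.

```python
import os
import re
import shutil
import tempfile

RULE_RE = re.compile(
    r'ATTR\{address\}=="(?P<mac>[^"]+)".*?NAME="(?P<name>[^"]+)"'
)

def corrected_line(line, sysnames):
    """Return the rule line with NAME set to the pre-upgrade sysname."""
    if line.lstrip().startswith("#"):
        return line  # leave comment lines untouched
    m = RULE_RE.search(line)
    if m is None:
        return line
    wanted = sysnames.get(m.group("mac").lower())
    if wanted is None:
        return line  # MAC not in the mapping: leave the rule as-is
    return line[:m.start("name")] + wanted + line[m.end("name"):]

def fix_rules_file(path, sysnames, backup_suffix=".bak"):
    """Copy the file to a backup, then rewrite NAME values in place."""
    backup_path = path + backup_suffix
    shutil.copy2(path, backup_path)
    with open(path) as f:
        lines = f.readlines()
    with open(path, "w") as f:
        f.writelines(corrected_line(line, sysnames) for line in lines)
    return backup_path

# Demo on a temporary copy; MAC addresses are the masked placeholders above.
original_text = (
    'SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", '
    'ATTR{address}=="##:##:##:##:fe:a2", NAME="eth0"\n'
    'SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", '
    'ATTR{address}=="##:##:##:##:0a:7e", NAME="eth10"\n'
)
sysnames = {"##:##:##:##:fe:a2": "eth8", "##:##:##:##:0a:7e": "eth10"}

with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "70-nsx-persistent-net.rules")
    with open(demo_path, "w") as f:
        f.write(original_text)
    backup = fix_rules_file(demo_path, sysnames)
    with open(demo_path) as f:
        fixed_text = f.read()
    with open(backup) as f:
        backup_text = f.read()

print(fixed_text)
```

Because each line is rewritten independently from the MAC address it contains, swapping names between two interfaces (for example eth0 and eth8) does not require an intermediate name.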