GENEVE tunnels are down between vtep interfaces on different subnets after upgrading to ESXi 8.X

Article ID: 382860

Products

VMware NSX

VMware vSphere ESX 8.x

Issue/Introduction

  • NSX Transport Nodes have been upgraded to an 8.X version of ESXi.
  • Routed communication between VMs connected to overlay segments fails after the upgrade, but communication between VMs on the same segment succeeds.
  • The VMs are able to ping their own gateway.
  • After the upgrade, GENEVE tunnels are down, but only between VTEP interfaces on different subnets.
  • DHCP option 121 (Classless Static Route) is configured on the DHCP server to advertise a default route.
  • The output of the host CLI command 'net-vdl2 -l' shows that the gateway IP is not set. As in the example below, the GW IP field has a value of 0.0.0.0:
NSX VDS:        <VDS Name>
VDS ID: <ID>
MTU:    9000
Segment ID:     <ID>
AddrType6:      Not Configured
Segment ID6:    ::
Transport VLAN ID:      <VLAN>
VTEP Count:     2
Forwarding Mode:        IPv4_ONLY
CDO status:     enabled (deactivated)
VTEP Interface: vmk10
DVPort ID:      <UUID>
Switch Port ID: <PORT ID>
Endpoint ID:    0
VLAN ID:        <VLAN>
Label:          129025
Uplink Port ID: <PORT ID>
Is Uplink Port LAG:     No
IP:             <IP ADDRESS>
Netmask:        255.255.255.0
Segment ID:     <ID>
IPv6:           ::
Prefix Length:  0
Segment ID6:    ::
GW IP:          0.0.0.0
GW MAC:         00:00:00:00:00:00
GW V6 IP:               ::
GW V6 MAC:              00:00:00:00:00:00
IP Acquire Timeout:     0
IPv6 Acquire Timeout:   0
Multicast Group Count:  0
Is DRVTEP:      No
State4:         UP   : NORMAL
State6:         DOWN : INIT/NO_IP
  • On the host, messages similar to the following may be seen in the /var/run/log/vmkernel logs:
2024-10-30T15:55:13.222Z In(182) vmkernel: cpu8:2097849)Tcpip: 1208: gateway = 0x0

2024-10-30T15:55:13.227Z In(182) vmkernel: cpu8:2097849)Net: 474: GatewayIPv4: 0.0.0.0, vmkIPv4: <VTEP INTERFACE IP ADDRESS>
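
The check on the 'net-vdl2 -l' output described above can be scripted when many hosts need to be reviewed. The following is a minimal Python sketch, assuming the output has already been collected as text and follows the field layout shown in the example. The function name and the sample addresses are hypothetical and are not part of any VMware tool.

```python
import re
from typing import List


def vteps_without_gateway(net_vdl2_output: str) -> List[str]:
    """Return the VTEP interface names whose 'GW IP' field is 0.0.0.0."""
    missing = []
    current = None  # name of the VTEP interface whose fields are being read
    for line in net_vdl2_output.splitlines():
        match = re.match(r"\s*VTEP Interface:\s*(\S+)", line)
        if match:
            current = match.group(1)
            continue
        # Anchored at the start of the line, so 'GW V6 IP:' is not matched.
        match = re.match(r"\s*GW IP:\s*(\S+)", line)
        if match and current is not None and match.group(1) == "0.0.0.0":
            missing.append(current)
    return missing


# Hypothetical sample using documentation addresses (192.0.2.0/24).
sample = """VTEP Interface: vmk10
IP:             192.0.2.10
GW IP:          0.0.0.0
GW V6 IP:               ::
VTEP Interface: vmk11
IP:             192.0.2.11
GW IP:          192.0.2.1
"""

print(vteps_without_gateway(sample))  # ['vmk10']
```

An interface returned by this function matches the symptom described in this article, but the DHCP configuration should still be checked to confirm the cause.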

Environment

VMware NSX-T

ESXi 8.X

Cause

  • The VTEP interface gateway IP is set to 0.0.0.0 because classless static route option 121 is configured on the DHCP server, and option 121 is not supported by the vmkernel.
  • On ESXi 7.X, the DHCP client does not request option 121 from the DHCP server, so the DHCP server sends router option 3 to the host in the DHCP offer packet. ESXi parses option 3 and sets the vmknic gateway IP from that information.
  • When a DHCP client requests option 121 in its DHCP discover message, the DHCP server responds with a DHCP offer packet that contains option 121 and not router option 3. Router option 3 is normally the default method for automatically configuring a gateway IP on an interface.
  • On ESXi 8.X, the behaviour of the DHCP client changed: the client now requests option 121 as well as option 3. As a result, the DHCP server sends only option 121 to the host and omits option 3. This is expected behaviour from the DHCP server. ESXi is not able to parse option 121, and because it has not received option 3, it cannot set the vmknic interface gateway IP. This results in a gateway IP of 0.0.0.0.
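
To confirm from a packet capture that the DHCP offer carries the default route only in option 121, the option payload can be decoded by hand. The following is a minimal Python sketch of the option 121 encoding defined in RFC 3442 (a prefix-width byte, the significant octets of the destination, then four octets for the router). The function name and the sample payload are hypothetical, and the payload bytes are assumed to have been extracted from the capture already.

```python
import ipaddress
from typing import List, Tuple


def decode_option_121(payload: bytes) -> List[Tuple[str, str]]:
    """Decode a Classless Static Route payload into (destination, router) pairs."""
    routes = []
    i = 0
    while i < len(payload):
        width = payload[i]  # prefix length in bits
        i += 1
        if width > 32:
            raise ValueError("invalid prefix width: %d" % width)
        significant = (width + 7) // 8  # destination octets actually encoded
        if i + significant + 4 > len(payload):
            raise ValueError("truncated option 121 payload")
        # Pad the destination back out to four octets.
        dest = payload[i:i + significant] + bytes(4 - significant)
        i += significant
        router = payload[i:i + 4]
        i += 4
        network = ipaddress.IPv4Network(
            "%s/%d" % (ipaddress.IPv4Address(dest), width), strict=False
        )
        routes.append((str(network), str(ipaddress.IPv4Address(router))))
    return routes


# Hypothetical payload: a default route and one /24 route via 192.0.2.1.
payload = bytes([0, 192, 0, 2, 1, 24, 198, 51, 100, 192, 0, 2, 1])
print(decode_option_121(payload))
# [('0.0.0.0/0', '192.0.2.1'), ('198.51.100.0/24', '192.0.2.1')]
```

A decoded entry with destination 0.0.0.0/0 is the default route that the DHCP server is advertising through option 121 instead of router option 3.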

Resolution

No code fix available.

Workaround:

Remove the default route that is advertised through classless static route option 121 from the DHCP server configuration, so that the DHCP server sends the gateway in router option 3.
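
The effect of the workaround can be illustrated with a small model of the behaviour described in the Cause section. The following Python sketch is not ESXi or DHCP server code. The function names, the request sets (reduced to options 3 and 121 only) and the addresses are assumptions made for illustration, and the model assumes router option 3 remains configured on the DHCP server.

```python
from typing import Dict, Set

ROUTER = 3                      # DHCP option 3 (Router)
CLASSLESS_STATIC_ROUTE = 121    # DHCP option 121 (Classless Static Route)


def server_offer(requested: Set[int], configured: Dict[int, str]) -> Dict[int, str]:
    """Model of the DHCP server: return the options placed in the offer."""
    offer = {code: value for code, value in configured.items() if code in requested}
    # When option 121 is sent, the modelled server leaves out option 3.
    if CLASSLESS_STATIC_ROUTE in offer:
        offer.pop(ROUTER, None)
    return offer


def vmknic_gateway(offer: Dict[int, str]) -> str:
    """Model of the vmkernel: only option 3 is used to set the gateway."""
    return offer.get(ROUTER, "0.0.0.0")


# Assumed request sets, reduced to the two options relevant here.
esxi7_request = {ROUTER}
esxi8_request = {ROUTER, CLASSLESS_STATIC_ROUTE}

# Hypothetical server configuration before and after the workaround.
before = {ROUTER: "192.0.2.1", CLASSLESS_STATIC_ROUTE: "0.0.0.0/0 via 192.0.2.1"}
after = {ROUTER: "192.0.2.1"}

print(vmknic_gateway(server_offer(esxi7_request, before)))  # 192.0.2.1
print(vmknic_gateway(server_offer(esxi8_request, before)))  # 0.0.0.0
print(vmknic_gateway(server_offer(esxi8_request, after)))   # 192.0.2.1
```

In this model, the ESXi 8.X host regains a gateway once option 121 is no longer configured, because the offer then carries router option 3 again.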