Virtual Machines (VMs) are Failing to Receive IP Addresses from NSX in a TKGm Environment

Article ID: 417737

Products

VMware Tanzu Kubernetes Grid Management

Issue/Introduction

In a TKGm 2.5.4 environment, virtual machines (VMs) fail to receive IP addresses from NSX, which disrupts production workloads across multiple clusters and numerous VMs. In kubectl describe output, the affected virtualmachine and machine resources show the status "WaitingForIPAllocation" and "VirtualMachineProvisioned: NotProvisioned".
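To gauge how many resources are affected, the JSON form of the same kubectl data can be filtered for the status above. The following is a minimal sketch, not a supported tool. It assumes that "WaitingForIPAllocation" appears in the `reason` field of an entry under `status.conditions`, and that the input is a list object as produced by `kubectl get <resource> -A -o json`. The function name `vms_waiting_for_ip` and the sample data are hypothetical.

```python
import json


def vms_waiting_for_ip(resource_list):
    """Return (namespace, name) pairs whose conditions report WaitingForIPAllocation.

    Assumption: the status string is stored in the `reason` field of a
    condition under `status.conditions`. Adjust if your resources differ.
    """
    waiting = []
    for item in resource_list.get("items", []):
        meta = item.get("metadata", {})
        conditions = item.get("status", {}).get("conditions", [])
        if any(c.get("reason") == "WaitingForIPAllocation" for c in conditions):
            waiting.append((meta.get("namespace", ""), meta.get("name", "")))
    return waiting


# Hypothetical sample shaped like `kubectl get <resource> -A -o json` output.
SAMPLE = json.loads("""
{
  "items": [
    {
      "metadata": {"namespace": "default", "name": "workload-md-0-abc"},
      "status": {"conditions": [
        {"type": "VirtualMachineProvisioned", "status": "False",
         "reason": "WaitingForIPAllocation"}
      ]}
    },
    {
      "metadata": {"namespace": "default", "name": "workload-md-0-def"},
      "status": {"conditions": [
        {"type": "VirtualMachineProvisioned", "status": "True"}
      ]}
    }
  ]
}
""")

if __name__ == "__main__":
    for namespace, name in vms_waiting_for_ip(SAMPLE):
        print(f"{namespace}/{name}")
```

In practice, the sample would be replaced by the parsed output of the kubectl command, for example by reading it from standard input with `json.load(sys.stdin)`.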

Cause

The scratch storage on the affected ESXi host(s), which the nest-db service uses, failed because it had been decommissioned. As a result, the nest-db agent cannot start on those hosts. The nest-db agent is essential for NSX-T's network communication and dynamic IP assignment.

Resolution

  1. Perform a rolling reboot of all NSX Managers.
  2. Set DRS to manual for the affected host(s).
  3. Reboot each affected host to clear the scratch partition issue.
  4. Wait for the nest-db agent to start successfully after the host comes back online.
  5. vMotion all affected VMs from each host that experienced the agent failure to a host where the nest-db agent is confirmed to be running. Repeat this for every host where the nest-db agent was in a failed state.
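
When many VMs and hosts are involved, it can help to write down the migrations for step 5 before starting. The sketch below only builds such a plan from data you supply. It does not talk to vCenter or NSX, and the function name `plan_vmotions`, the host names, and the VM names are all hypothetical. It assumes you have already confirmed which hosts have a running nest-db agent.

```python
def plan_vmotions(vm_to_host, failed_hosts, healthy_hosts):
    """Build a list of (vm, source_host, target_host) migrations.

    vm_to_host:    dict mapping VM name to its current host
    failed_hosts:  hosts where the nest-db agent was in a failed state
    healthy_hosts: hosts with a confirmed running nest-db agent
    Targets are assigned round-robin so VMs spread across healthy hosts.
    """
    failed = set(failed_hosts)
    targets = sorted(healthy_hosts)
    plan = []
    moved = 0
    for vm in sorted(vm_to_host):
        source = vm_to_host[vm]
        if source not in failed:
            continue  # VM is not on a host that had the agent failure
        # Never pick the source host as its own migration target.
        candidates = [host for host in targets if host != source]
        if not candidates:
            raise ValueError(f"no healthy target host available for {vm}")
        plan.append((vm, source, candidates[moved % len(candidates)]))
        moved += 1
    return plan


if __name__ == "__main__":
    # Hypothetical inventory for illustration only.
    inventory = {"vm-a": "esx-01", "vm-b": "esx-01", "vm-c": "esx-02"}
    for vm, source, target in plan_vmotions(
        inventory, failed_hosts={"esx-01"}, healthy_hosts=["esx-02", "esx-03"]
    ):
        print(f"{vm}: {source} -> {target}")
```

The round-robin assignment is a design choice made for this sketch; actual placement should still respect capacity and any affinity rules in the cluster.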