ESXi 9.x Upgrade Failure Due to Degraded NSX Host State



Article ID: 435877



Products

VMware NSX

Issue/Introduction

  • The ESXi 9.x upgrade via SDDC Manager failed during pre-checks because the NSX Manager was unable to communicate with host management services due to RPC-layer errors.
    SDDC Manager LCM Logs - /var/log/vmware/vcf/lcm/lcm.log
    <Time stamp> ERROR [vcf_lcm,69#####487,ff84,auditId=f3#####4d-1b7edaeddf26,resourceType=NSX_T_MANAGER,resourceId=<nsx manager name>
    e=<nsx manager name> [c.v.v.c.n.s.c.c.ComplexHelpers,Scheduled-8] Exception occurred during NSX API invocation
    java.util.concurrent.ExecutionException: com.vmware.vapi.std.errors.ServiceUnavailable: ServiceUnavailable (com.vmware.vapi.std.errors.service_unavailable) (statusCode:503) => {
        messages = [],
        data = <null>,
        errorType = SERVICE_UNAVAILABLE
  • The /var/log/proton logs on the NSX Manager show the following error signature:

    <Time stamp> INFO nsx-rpc:RPC_PROXY_CONN_PROVIDER: ... frame=rpc_msg { status { code: UNAVAILABLE error_msg: "Requested service vmware.nsx.agg_service.l2.L2QueryService is not registered with forwarder. Check with service provider" } }
  • In the NSX UI, under System > Fabric > Nodes > Host Transport Nodes, the affected host displays a status of Degraded.

  • Connectivity Check: Physical uplinks associated with the host's Virtual Distributed Switch (VDS) or N-VDS are reported as Down.

  • Service Status: The L2QueryService is unreachable because the transport path between the Manager and the Host agent is broken.
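
The proton log signature listed above can be searched for from a shell instead of by eye. The following is a minimal sketch, not a supported tool: the function names are hypothetical, and it assumes a POSIX shell with grep and awk on the appliance that holds the logs. The search string is copied from the log excerpt in this article. Compressed (rotated) logs are not searched.

```shell
#!/bin/sh
# Hypothetical helpers (not part of NSX or SDDC Manager).

# Error text taken from the proton log excerpt in this article.
SIGNATURE='is not registered with forwarder'

# Count log lines under a file or directory that contain the signature.
# $1: path to search, for example /var/log/proton on the NSX Manager.
count_forwarder_errors() {
    grep -rF -- "$SIGNATURE" "$1" 2>/dev/null | wc -l | tr -d ' '
}

# Print each distinct service name that is reported as unregistered.
# $1: path to search.
list_unregistered_services() {
    grep -rhoE 'Requested service [A-Za-z0-9_.]+ is not registered with forwarder' "$1" 2>/dev/null |
        awk '{ print $3 }' | sort -u
}
```

For example, running list_unregistered_services /var/log/proton against a log containing the excerpt above would print vmware.nsx.agg_service.l2.L2QueryService.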

Environment

VMware NSX
VMware ESXi
VMware Cloud Foundation

Cause

The upgrade pre-check is a fail-safe mechanism. It requires all registered NSX services, such as the L2QueryService, to be functional so that network segments and bridge profiles remain intact during the host reboot and VIB migration. When physical uplinks are down, the RPC forwarder cannot register the necessary services, which leads to the UNAVAILABLE status.
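
The link state described above can be confirmed on the host itself. The following is a minimal sketch, assuming shell access to the ESXi host and the default column layout of esxcli network nic list (Name, PCI Device, Driver, Admin Status, Link Status, and so on). The function name is hypothetical. It reports every physical NIC whose link is down, not only those attached to the VDS or N-VDS, so compare the result with the switch's uplink assignment.

```shell
#!/bin/sh
# Hypothetical helper: read `esxcli network nic list` output on stdin and
# print the name of every physical NIC whose Link Status column is "Down".
# Assumes the default layout: a header row, a dashed separator row, then one
# row per NIC with Link Status in the fifth whitespace-separated column.
down_uplinks() {
    awk 'NR > 2 && $5 == "Down" { print $1 }'
}

# On an ESXi host:
#   esxcli network nic list | down_uplinks
```

An empty result means no physical NIC on the host reports a down link.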

Resolution

To resolve this issue and proceed with the upgrade, perform the following steps:

  1. Physical Layer Restoration: Coordinate with the physical networking team to inspect cables, SFPs, and Top-of-Rack (ToR) switch ports. Ensure the links are administratively Up and signaling correctly.

  2. Verify Host Health: Once physical connectivity is restored, log in to the NSX UI and confirm that the Host Transport Node status has transitioned from Degraded to Success/Up.

  3. Service Validation: From the ESXi CLI, verify the NSX agent (nsxa) and rpc-proxy services are running:

    • /etc/init.d/nsx-proxy status

  4. Rerun Pre-check: Re-initiate the ESXi 9.x upgrade pre-check from the Lifecycle Manager or NSX Upgrade Coordinator.
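
The host health check in step 2 can also be performed without the UI. The sketch below assumes that the NSX Manager API endpoint GET /api/v1/transport-nodes/<node-id>/status is available in your NSX version and that curl is installed on the machine running it. The function names are hypothetical, and the Manager address, node ID, and user name are placeholders you supply. It prints the raw JSON response and does not interpret it.

```shell
#!/bin/sh
# Hypothetical helpers; all arguments are placeholders supplied by the caller.

# Build the status URL for one host transport node.
# $1: NSX Manager FQDN or IP, $2: transport node ID.
tn_status_url() {
    printf 'https://%s/api/v1/transport-nodes/%s/status\n' "$1" "$2"
}

# Fetch the status document for one host transport node.
# curl prompts for the password because only the user name is passed to -u.
# -k skips certificate verification; drop it if the client trusts the
# Manager certificate.
# $1: NSX Manager, $2: transport node ID, $3: API user name.
get_tn_status() {
    curl -k -u "$3" "$(tn_status_url "$1" "$2")"
}
```

If the response still reports the node as degraded, recheck steps 1 and 3 before rerunning the pre-check.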