Edge node configuration state from the NSX UI shows as failed.



Article ID: 376854


Updated On:

Products

VMware NSX

Issue/Introduction

  • In the NSX UI, under System --> Fabric --> Nodes --> Edge Transport Nodes, the "Configuration State" for the affected node shows as "Failed".

  • Clicking on "Failed" shows the error: "Host configuration: Failed to send the HostConfig message. [TN=TransportNode/########-####-####-####-########6f6c]. Reason: Mac address for a vnic null is not found on edge node /infra/sites/default/enforcement-points/default/edge-transport-node/########-####-####-####-########6f6c.".



  • From the NSX Manager logs, the desired_state_manager.json file reports the transport node status as below:

    "/nsxapi/api/v1/transport-nodes/state": {
    {
      "details": [
        {
          "failure_code": 8804,
          "failure_message": " Host configuration: Failed to send the HostConfig message. [TN=TransportNode/########-####-####-####-########6f6c]. Reason: Mac address for a vnic null is not found on edge node /infra/sites/default/enforcement-points/default/edge-transport-node/########-####-####-####-########6f6c.",
          "state": "failed",
          "sub_system_id": "########-####-####-####-########6f6c",
          "sub_system_type": "Host"
        }
      ],
      "failure_code": 8804,
      "failure_message": "Host configuration failed. Number of retries : 11. Next retry attempt will be between 2024-Aug-19 23.10.31 PM and 2024-Aug-19 23.13.31 PM (UTC).",
      "maintenance_mode_state": "DISABLED",
      "node_deployment_state": {
        "details": [],
        "failure_code": 0,
        "failure_message": "",
        "state": "NODE_READY"
      },
      "state": "failed",
      "transport_node_id": "########-####-####-####-########6f6c"
    }
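The same state is available from the NSX API via GET /api/v1/transport-nodes/&lt;node-id&gt;/state. Below is a minimal Python sketch (not an official tool) that extracts the per-subsystem failure details from such a response body; the field names are taken from the excerpt above, and the sample message is abbreviated:

```python
import json

# Sample response body, reduced to the fields shown above
# (the failure message is abbreviated for readability).
state_json = """
{
  "state": "failed",
  "failure_code": 8804,
  "failure_message": "Host configuration failed. Number of retries : 11.",
  "details": [
    {
      "failure_code": 8804,
      "failure_message": "Mac address for a vnic null is not found on edge node",
      "state": "failed",
      "sub_system_type": "Host"
    }
  ]
}
"""

def summarize_state(body: str) -> list[str]:
    """Return one human-readable line per failed sub-system detail."""
    state = json.loads(body)
    lines = []
    for detail in state.get("details", []):
        if detail.get("state") == "failed":
            lines.append(f"{detail['sub_system_type']}: "
                         f"code {detail['failure_code']}: "
                         f"{detail['failure_message']}")
    return lines

for line in summarize_state(state_json):
    print(line)
```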

Environment

VMware NSX 4.x

Cause

This issue can occur when there is a misconfiguration or mismatch in the UplinkHostSwitchProfile used by the edge node.

For example, the desired_state_manager.json file below shows the configuration of two edge nodes that are part of the same cluster:

Working edge node:

"display_name": "edge01",
"failure_domain_id": "########-####-####-####-########aafb",
"host_switch_spec": {
  "host_switches": [
    {
      "cpu_config": [],
      "host_switch_id": "########-####-####-####-########6608",
      "host_switch_mode": "STANDARD",
      "host_switch_name": "Edge-NVDS",
      "host_switch_profile_ids": [
        {
          "key": "UplinkHostSwitchProfile",
          "value": "########-####-####-####-########d478"
        }
      ],
      "host_switch_type": "NVDS",
      "ip_assignment_spec": {
        "ip_pool_id": "########-####-####-####-########1c2c",
        "resource_type": "StaticIpPoolSpec"
      },
      "is_migrate_pnics": false,
      "not_ready": false,
      "pnics": [
        {
          "device_name": "fp-eth0",
          "uplink_name": "uplink-1"
        },
        {
          "device_name": "fp-eth1",
          "uplink_name": "uplink-2"
        }

Problematic edge node:

"display_name": "edge02",
"failure_domain_id": "########-####-####-####-########aafb",
"host_switch_spec": {
  "host_switches": [
    {
      "cpu_config": [],
      "host_switch_id": "########-####-####-####-########6608",
      "host_switch_mode": "STANDARD",
      "host_switch_name": "Edge-NVDS",
      "host_switch_profile_ids": [
        {
          "key": "UplinkHostSwitchProfile",
          "value": "########-####-####-####-########4988"
        }
      ],
      "host_switch_type": "NVDS",
      "ip_assignment_spec": {
        "ip_pool_id": "########-####-####-####-########1c2c",
        "resource_type": "StaticIpPoolSpec"
      },
      "is_migrate_pnics": false,
      "not_ready": false,
      "pnics": [
        {
          "device_name": "fp-eth0",
          "uplink_name": "uplink-1"
        },
        {
          "device_name": "fp-eth1",
          "uplink_name": "uplink-2"
        }

From the above, both nodes map fp-eth0 and fp-eth1. However, the UplinkHostSwitchProfile differs: the working node uses "########-####-####-####-########d478", while the problematic node uses "########-####-####-####-########4988".

Checking the configuration of these profiles shows that the working node's UplinkHostSwitchProfile (########-####-####-####-########d478) defines two uplinks, whereas the problematic node's profile (########-####-####-####-########4988) defines four. The problematic node, however, maps only two uplinks to pnics. This is the misconfiguration/mismatch.

"display_name": "host-uplinkprofile",
"id": "########-####-####-####-########4988",
"overlay_encap": "GENEVE",
"resource_type": "UplinkHostSwitchProfile",
"tags": [],
"teaming": {
  "active_list": [
    {
      "uplink_name": "uplink-1",
      "uplink_type": "PNIC"
    },
    {
      "uplink_name": "uplink-2",
      "uplink_type": "PNIC"
    },
    {
      "uplink_name": "uplink-3",
      "uplink_type": "PNIC"
    },
    {
      "uplink_name": "uplink-4",
      "uplink_type": "PNIC"
    }
  ],
  "policy": "LOADBALANCE_SRCID",
  "rolling_order": false
},
  "transport_vlan": 1046
}

"display_name": "edge-uplinkprofile",
"id": "########-####-####-####-########d478",
"overlay_encap": "GENEVE",
"resource_type": "UplinkHostSwitchProfile",
"tags": [],
"teaming": {
  "active_list": [
    {
      "uplink_name": "uplink-1",
      "uplink_type": "PNIC"
    },
    {
      "uplink_name": "uplink-2",
      "uplink_type": "PNIC"
    }
  ],
  "policy": "LOADBALANCE_SRCID",
  "rolling_order": false
},
  "transport_vlan": 1046
}
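The mismatch can also be spotted programmatically, by comparing the number of pnic-to-uplink mappings on the transport node against the number of active uplinks in the profile it references. Below is a minimal sketch (not an official tool); the field names come from the JSON above, and the profile IDs are placeholders:

```python
# Host switch definition from the transport node (as in the excerpts above);
# only the fields needed for the check are included. "profile-4988" is a
# placeholder for the problematic UplinkHostSwitchProfile ID.
host_switch = {
    "host_switch_profile_ids": [
        {"key": "UplinkHostSwitchProfile", "value": "profile-4988"}
    ],
    "pnics": [
        {"device_name": "fp-eth0", "uplink_name": "uplink-1"},
        {"device_name": "fp-eth1", "uplink_name": "uplink-2"},
    ],
}

# Uplink profiles keyed by ID; this one mirrors "host-uplinkprofile" above,
# which defines four active uplinks.
profiles = {
    "profile-4988": {
        "teaming": {
            "active_list": [
                {"uplink_name": f"uplink-{i}", "uplink_type": "PNIC"}
                for i in range(1, 5)
            ]
        }
    }
}

def uplink_mismatch(host_switch: dict, profiles: dict) -> bool:
    """True if the referenced profile defines a different number of
    active uplinks than the transport node actually maps to pnics."""
    profile_id = next(p["value"] for p in host_switch["host_switch_profile_ids"]
                      if p["key"] == "UplinkHostSwitchProfile")
    active = profiles[profile_id]["teaming"]["active_list"]
    return len(active) != len(host_switch["pnics"])

print(uplink_mismatch(host_switch, profiles))  # 4 uplinks vs 2 pnics -> True
```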

Resolution

Change the uplink profile on the problematic edge node to the correct one using the following steps:

  1. Place the problematic edge node into maintenance mode (From NSX UI --> System --> Fabric --> Nodes --> Edge Transport Nodes --> Select the problematic node --> Actions --> Enter NSX Maintenance Mode).

  2. Select the same edge node again --> click the pencil (Edit) icon --> for the affected switch, select the correct "Uplink Profile" --> scroll down to map the "Virtual NICs" against the "Uplinks", and click Save.
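In terms of the desired state, the edit above replaces the UplinkHostSwitchProfile reference on the named host switch with the ID of the correct profile. The sketch below shows that JSON transformation only (the profile IDs are placeholders); the actual change is applied through the UI steps above:

```python
import copy

def set_uplink_profile(transport_node: dict, switch_name: str,
                       new_profile_id: str) -> dict:
    """Return a copy of the transport node spec with the
    UplinkHostSwitchProfile of the named host switch replaced."""
    node = copy.deepcopy(transport_node)
    for switch in node["host_switch_spec"]["host_switches"]:
        if switch["host_switch_name"] != switch_name:
            continue
        for ref in switch["host_switch_profile_ids"]:
            if ref["key"] == "UplinkHostSwitchProfile":
                ref["value"] = new_profile_id
    return node

# Problematic node, reduced to the relevant fields ("profile-4988" and
# "profile-d478" are placeholders for the wrong and correct profile IDs).
edge02 = {
    "display_name": "edge02",
    "host_switch_spec": {
        "host_switches": [
            {
                "host_switch_name": "Edge-NVDS",
                "host_switch_profile_ids": [
                    {"key": "UplinkHostSwitchProfile", "value": "profile-4988"}
                ],
            }
        ]
    },
}

fixed = set_uplink_profile(edge02, "Edge-NVDS", "profile-d478")
print(fixed["host_switch_spec"]["host_switches"][0]
           ["host_switch_profile_ids"][0]["value"])  # profile-d478
```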