Host to NSX Controller connectivity triage

Article ID: 410581

Products

VMware NSX

Issue/Introduction

Connectivity between an ESXi host and the NSX Controller can go down for many reasons. Common examples: the nsx-proxy service is down, certificate issues, network issues, or an upgrade that modified configuration files.

Environment

VMware NSX 4.x and later

Resolution

To help identify the root cause, a runbook is available on the ESXi host and on the NSX Edge.

  1. Log in to the ESXi host as "root" and start nsxcli. Example:

    [root@esxi:~] nsxcli
    esxi.fqdn>

    On an NSX Edge, log in as the "admin" user instead. Example:

    nsxedge>

  2. Set the running level of the nsx-ods service to internal. Example:

    esxi.fqdn> set service nsx-ods running-level internal
    Running level: internal

    Note on runbook levels:

    ODS runbooks come in two levels, depending on the intended users:

    • External - runbooks that customers can run for their own investigation
    • Internal - runbooks intended for use by VMware internal employees

    Note: To list the available runbooks, run the command "get runbook".

  3. Run the nsx_proxy runbook with the command "start invocation runbook nsx_proxy".

    The report describes the state of the Controller channel and Appliance channel connectivity from the ESXi host (the transport node), and includes a recommendation for resolving any issue it finds.

    Example output from a working environment (no issues reported):

    esxi.fqdn> start invocation runbook nsx_proxy
    Runbook Invocation Report

    Invocation ID   : ########-####-####-####-############
    Timestamp       : DATE TIME
    System Info
        Host Name       : esxi.fqdn
        OS Name         : VMkernel
        OS Version      : 8.0.3
        Arch            : x86_64
    Runbook Info
        Runbook ID      : nsx_proxy
        Version         : 2.1
        Publisher       : Broadcom Inc.
    Report Type     : VALID
    Conclusion      : No connectivity issues has been detected for Controller channel and Appliance channel.
    Recommendation  : No action needs to be taken.
    Artifact Bundle : <none>
    Steps

        Step Number     : 1
        Step Action     : Fetch nsx-proxy service status.
        Step Result     : Service nsx-proxy is up

        Step Number     : 2
        Step Action     : Fetch nsx-opsagent service status.
        Step Result     : Service nsx-opsagent is up

        Step Number     : 3
        Step Action     : Check whether Maintenance mode is enabled on Transport Node.
        Step Result     : Maintenance mode checks performed, no issues detected on this transport node.

        Step Number     : 4
        Step Action     : Check whether Appliance nodes settings are empty or duplicate.
        Step Result     : No issues has been detected for Appliance channel, total Appliance channel nodes present in configuration file are: 3.

        Step Number     : 5
        Step Action     : Check whether Controller nodes settings are empty or duplicate.
        Step Result     : No issues has been detected for Controller channel, total Controller channel nodes present in configuration file are: 3.

        Step Number     : 6
        Step Action     : Perform ping and port status check for all Appliance channel nodes.
        Step Result     : Ping and port test are successful for all Appliance channel nodes.

        Step Number     : 7
        Step Action     : Perform ping and port status check for all Controller channel nodes.
        Step Result     : Ping and port test are successful for all Controller channel nodes.

        Step Number     : 8
        Step Action     : Perform certification validation for all Controller channel nodes.
        Step Result     : Controller channel nodes certification validation passed.

        Step Number     : 9
        Step Action     : Perform certification validation for all Appliance channel nodes.
        Step Result     : Appliance channel nodes certification validation passed.

        Step Number     : 10
        Step Action     : Perform CRL validation for all Controller channel nodes.
        Step Result     : No issues detected for Controller channel certificate revocation list validation.

        Step Number     : 11
        Step Action     : Perform CRL validation for all appliance channel nodes.
        Step Result     : No issues detected for Appliance channel certificate revocation list validation.



  4. When finished, restore the running level to the default, external:

    esxi.fqdn> set service nsx-ods running-level external
    Running level: external
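
The procedure above (raise the running level, run the runbook, restore the level) and the report layout can be sketched in Python. This is an illustrative sketch, not a VMware-provided tool. It assumes a hypothetical `run` callable that sends one command to nsxcli and returns its output as text, and it assumes the report keeps the "Key : value" layout shown in the example output. The names `internal_running_level`, `parse_report`, and `triage` are invented for this example.

```python
import re
from contextlib import contextmanager

# Excerpt of the example report from this article, used for the demo below.
SAMPLE = """\
Runbook Invocation Report

Report Type     : VALID
Conclusion      : No connectivity issues has been detected for Controller channel and Appliance channel.
Recommendation  : No action needs to be taken.
Steps

    Step Number     : 1
    Step Action     : Fetch nsx-proxy service status.
    Step Result     : Service nsx-proxy is up

    Step Number     : 2
    Step Action     : Fetch nsx-opsagent service status.
    Step Result     : Service nsx-opsagent is up
"""

# "Key : value" lines; the key is letters and spaces only, so a colon
# inside a value (for example "file are: 3.") stays part of the value.
FIELD = re.compile(r"^\s*([A-Za-z][A-Za-z ]*?)\s*:\s*(.*)$")


@contextmanager
def internal_running_level(run):
    """Set nsx-ods to internal, and always restore external afterwards.

    `run` is a hypothetical callable (an assumption of this sketch) that
    sends a single command to nsxcli and returns the output as a string.
    """
    run("set service nsx-ods running-level internal")
    try:
        yield
    finally:
        # Runs even if the runbook invocation raised an exception.
        run("set service nsx-ods running-level external")


def parse_report(text):
    """Split a report into top-level fields and a list of step dicts."""
    fields, steps = {}, []
    for line in text.splitlines():
        match = FIELD.match(line)
        if not match:
            continue  # section titles such as "Steps" carry no value
        key, value = match.group(1), match.group(2).strip()
        if key == "Step Number":
            steps.append({"number": int(value)})
        elif key in ("Step Action", "Step Result") and steps:
            steps[-1][key.split()[1].lower()] = value
        else:
            fields[key] = value
    return fields, steps


def triage(run):
    """Run the nsx_proxy runbook at the internal level and parse the report."""
    with internal_running_level(run):
        text = run("start invocation runbook nsx_proxy")
    return parse_report(text)


if __name__ == "__main__":
    # Stand-in for a real nsxcli session: returns the sample report.
    def fake_run(command):
        return SAMPLE if command.startswith("start invocation") else ""

    fields, steps = triage(fake_run)
    print(fields["Conclusion"])
    print(fields["Recommendation"])
    for step in steps:
        print(step["number"], step["action"], "->", step["result"])
```

Wrapping the runbook call in a context manager means the running level is set back to external even when the invocation fails, which mirrors step 4. How `run` actually reaches nsxcli (an interactive session, SSH, or another mechanism) is deliberately left open.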