Datapath Capture to Diagnose Application and Connectivity Issues When Using the NSX Native Load Balancer


Article ID: 417724


Updated On:

Products

VMware NSX

Issue/Introduction

  • This article outlines a structured methodology for collecting the data, logs, and packet captures needed to diagnose application-level connectivity issues when using the NSX Native Load Balancer. Examples include an application that was down for a specific period, intermittent VIP access, and user disconnections.

  • This information enables Broadcom Support to perform datapath connectivity diagnostics for the reported timeframe.

Environment

VMware NSX 

Resolution

Required Scoping Information:

Gather the following scoping details:

  • NSX Version:
  • Load Balancer Details:
    • Name:
    • UUID:
    • Virtual Server Name:
    • Pool Members (IPs):
  • Edge Details (for the LB):
    • Active Edge (Name & IP):
    • Standby Edge (Name & IP):
    • Are there any Active Edge Alarms visible on the NSX UI?

Triage Checklist

Detailed Problem Statement:
 
Issue Timeline and Scope

  • Problem Start Time: [Date and Time]

  • Problem End Time (if applicable): [Date and Time]

Scope of issue:

  • Scope: Is all VIP traffic hosted on the impacted Load Balancer affected? (Yes/No)

  • Specific VIPs: If only specific VIPs are impacted, please list the specific VIP names/IPs.

  • Accessibility: Are the affected VIPs completely inaccessible or is the inaccessibility intermittent?

  • User Impact: Are all users attempting to use the LB impacted, or is it isolated to only a specific subnet of users?

Follow the stages below to collect the data, logs, and packet captures needed to diagnose application connectivity issues:


Stage 1: Initial Connectivity Tests

The following basic network tests must be performed to check for reachability, ping loss, or latency issues:

  1. Run ping tests to the Virtual IP (VIP).

  2. Run ping tests to each pool member IP address.

  3. Run TCP port connectivity tests towards the VIP.

  4. Run TCP port connectivity tests towards each pool member.
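
The TCP port tests in steps 3 and 4 can be scripted so that every target is probed the same way and each result carries a UTC timestamp for later correlation with logs. The following is a minimal Python sketch, not a Broadcom-provided tool: `tcp_check` and `check_targets` are hypothetical helper names, and the default timeout is an assumption. The ping tests in steps 1 and 2 still need the operating system's `ping` utility, since ICMP is not covered here.

```python
import socket
import time
from datetime import datetime, timezone


def tcp_check(host, port, timeout=3.0):
    """Attempt one TCP handshake and report the outcome and elapsed time."""
    started = datetime.now(timezone.utc).isoformat()
    t0 = time.monotonic()
    try:
        # create_connection performs the full three-way handshake
        with socket.create_connection((host, port), timeout=timeout):
            ok, error = True, None
    except OSError as exc:
        # Covers refused connections, timeouts, and name resolution errors
        ok, error = False, str(exc)
    elapsed_ms = round((time.monotonic() - t0) * 1000, 1)
    return {
        "time_utc": started,
        "host": host,
        "port": port,
        "ok": ok,
        "elapsed_ms": elapsed_ms,
        "error": error,
    }


def check_targets(targets, timeout=3.0):
    """Run tcp_check against a list of (host, port) pairs."""
    return [tcp_check(host, port, timeout) for host, port in targets]
```

For example, `check_targets([("192.0.2.10", 443), ("192.0.2.21", 8443)])` would probe a VIP and one pool member; both addresses and ports are placeholders to be replaced with the values from the scoping details. Running the same list from a client behind the VIP and from a host on the pool member network helps show on which side a failure occurs.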

Stage 2: Error Message Capture (Application Layer)

Capture the exact error message observed when accessing the services directly and via the VIP:

  1. Access the VIP and capture the complete error message displayed on the browser.

  2. Access each pool member directly by IP address/hostname and capture the complete error message displayed on the browser.

  3. Capture HAR results (HTTP Archive) while reproducing the issue by following the instructions in the referenced Broadcom KB: How to generate HAR file (KB 205795).
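
Before attaching the HAR file from step 3, it can help to list the failing requests and their timestamps so they can be matched against the problem start and end times. The following is a minimal Python sketch that assumes the standard HAR 1.2 JSON layout (`log.entries[]` with `request` and `response` objects); `failed_har_entries` is a hypothetical helper name, and treating status 0 as "no HTTP response received" is an assumption based on how browsers commonly record aborted requests.

```python
import json


def failed_har_entries(har_path, min_status=400):
    """Return HAR entries that got an error status or no response at all."""
    # utf-8-sig also accepts files that start with a byte order mark
    with open(har_path, encoding="utf-8-sig") as fh:
        entries = json.load(fh)["log"]["entries"]

    failures = []
    for entry in entries:
        status = entry["response"]["status"]
        # Assumption: status 0 means the browser never received a response
        if status == 0 or status >= min_status:
            failures.append({
                "started": entry["startedDateTime"],
                "method": entry["request"]["method"],
                "url": entry["request"]["url"],
                "status": status,
                "status_text": entry["response"].get("statusText", ""),
                "time_ms": entry["time"],
            })
    return failures
```

Calling `failed_har_entries("capture.har")` (the file name is a placeholder) returns one dictionary per failing request. The `started` values give the timestamps to look for in the load balancer access logs and packet captures.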

Stage 3: Enable Debug Logging (NSX Edge)

Enable verbose logging on the NSX Edge to capture detailed flow information during the issue timeframe.

  1. Enable DEBUG logging on the Load Balancer page in the NSX UI.

  2. Enable Access Logs on the Virtual Server page in the NSX UI.

  3. Crucial Note: Gather logs from the Edge BEFORE disabling debug logging, as debug logs are immediately deleted upon disabling the setting.

  4. Revert Logging: Ensure debug logging is reverted to the default level immediately after log capture is complete.

Stage 4: Packet Capture and Log Collection

The following deep inspection data and logs are mandatory for complete datapath analysis:

  1. Packet Capture (PCAP): Perform a packet capture on the LB service interface on the active Edge node. Use the exclusive packet capture method defined in the following Broadcom KB: Configuring Exclusive Packet Capture for Load Balancer (KB 345763).

  2. Application TCP Dump: Run a TCP dump on the pool member hosting the application at the same time, while the error is being reproduced.
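
For the pool member capture in step 2, the following minimal Python sketch builds a `tcpdump` command line. It assumes a Linux pool member with `tcpdump` installed; the interface name, port, and output path are placeholders, and `build_tcpdump_command` is a hypothetical helper name. It does not cover the Edge capture in step 1, which should follow KB 345763.

```python
import shlex


def build_tcpdump_command(interface, port, output_file, peer_ip=None):
    """Build a tcpdump argument list for capturing application traffic."""
    # BPF filter: limit the capture to the application's TCP port
    bpf = f"tcp port {int(port)}"
    if peer_ip:
        # Optionally narrow the capture to a single peer address
        bpf = f"host {peer_ip} and {bpf}"
    return [
        "tcpdump",
        "-i", interface,    # capture interface
        "-nn",              # do not resolve host names or port names
        "-s", "0",          # capture full packets, not truncated ones
        "-w", output_file,  # write raw packets to a pcap file
        bpf,
    ]


command = build_tcpdump_command("eth0", 8443, "/tmp/pool1.pcap")
print(shlex.join(command))
# prints: tcpdump -i eth0 -nn -s 0 -w /tmp/pool1.pcap 'tcp port 8443'
```

The filter is kept to the application port by default because the source address seen by the pool member depends on whether the load balancer applies SNAT. Start this capture before reproducing the issue and stop it only after the Edge capture has finished, so both files cover the same time window.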

Stage 5: Historical Data from VMware Aria Operations (vROps)

If VMware Aria Operations (vROps) is configured in your environment, collect the graphs outlined in the following KB:

Troubleshooting NSX Native Load Balancer Issues using VMware Aria Operations

Mandatory Logs:

  • Edge Log Bundle: Download the Edge log bundle with core file enabled.

  • Manager Log Bundle: Download the NSX Manager log bundle.