Common scripts to help isolate intermittent latency and packet drop issues

Article ID: 382225

Updated On:

Products

VMware vDefend Firewall
VMware vDefend Firewall with Advanced Threat Prevention

Issue/Introduction

This article provides generic scripts that can help isolate intermittent latency and packet drop issues in VMware by Broadcom environments. These scripts are designed to capture network flow information, monitor network statistics, and perform packet captures. Please ensure you modify the scripts' content to fit your specific use case and environment before executing them.

Pre-Requisites:

  • You should have access to the ESXi host and relevant privileges to run the scripts.
  • These scripts should be executed during the occurrence of the issue, preferably within one hour of hitting the issue to collect meaningful data.

Resolution

The following sample scripts can be used in various troubleshooting scenarios. It is up to the user to determine which script(s) fit a specific use case.


Script 1: To collect the getflows output.

Explanation/Instructions:
- Creates a folder for the capture session.
- Captures DFW getflows information for the IP ##.##.##.##
- Runs every 8 seconds for 12 hours (5400 times) per capture session, and repeats for 14 sessions (about 7 days in total).
- Logs are stored in /var/run/log/getflows/
- Change the nic-xxxxx-ethx-vmware-sfw.2 to match the environment.
- Adjust the run time, destination directory, and other parameters as needed.
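
The filter name placeholder has to be replaced with the actual DFW filter name of the VM's vNIC. The following is a minimal sketch for picking such names out of a listing, assuming the names follow the nic-<id>-eth<n>-vmware-sfw.<n> pattern shown above and that a listing (for example the output of summarize-dvfilter) is piped in. The function name list_sfw_filters is hypothetical.

```shell
# Hypothetical helper: print every distinct DFW filter name found on stdin.
# Assumes names look like nic-<digits>-eth<digits>-vmware-sfw.<digits>.
list_sfw_filters() {
    grep -oE 'nic-[0-9]+-eth[0-9]+-vmware-sfw\.[0-9]+' | sort -u
}

# Assumed usage on an ESXi host:
#   summarize-dvfilter | list_sfw_filters
```

Pick the name that belongs to the affected VM's vNIC and substitute it in the script.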

#!/bin/bash
mkdir -p /var/run/log/getflows/

# Outer loop: 14 capture sessions, each in its own timestamped folder.
for i in $(seq 1 1 14) ; do
 currDate=$(date +%Y-%m-%d_%H-%M-%S)
 mkdir /var/run/log/getflows/$currDate

 # 5400 * 8 = 43200 equates to 12 hours.
 for j in $(seq 1 1 5400) ; do
  echo "======================================" >> /var/run/log/getflows/$currDate/getflows.txt;
  date >> /var/run/log/getflows/$currDate/getflows.txt;
  vsipioctl getflows -f nic-xxxxxxx-ethX-vmware-sfw.2 | grep "##.##.##.##" >> /var/run/log/getflows/$currDate/getflows.txt;
  sleep 8;
 done
done
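
When changing the run time, the loop count has to be recalculated from the desired duration and the sleep interval. A small sketch of that arithmetic follows; the function name iterations_for is hypothetical and not part of the original script.

```shell
# Hypothetical helper: number of loop iterations needed to cover
# a run time of $1 hours when the loop sleeps $2 seconds per pass.
iterations_for() {
    hours=$1
    interval=$2
    echo $(( hours * 3600 / interval ))
}

iterations_for 12 8    # prints 5400, the inner loop count of Script 1
```

The result ignores the time the commands themselves take, so the real run time is slightly longer than the computed one.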

 

Script 2: To collect the netstats output

Instructions:
- Collects net-stats output roughly every 8 seconds for about 7 days (75600 iterations with an 8-second sleep between them).
- Removes older files to ensure only the latest 500 files are retained.
- Logs are stored in /var/run/log/netstats/.

#!/bin/bash

mkdir -p /var/run/log/netstats/

# Directory to monitor
directory="/var/run/log/netstats/"

# Number of newest files to keep
x=500

for i in $(seq 1 1 75600) ; do
 currDate=$(date +%Y-%m-%d_%H-%M-%S)
 net-stats -i 1 -ticqQWS -A > /var/run/log/netstats/netstats-timed-$currDate

 # List files newest first and delete everything after the first X
 ls -t "$directory" | tail -n +$((x+1)) | while read file; do
    rm -f "$directory/$file"
 done

 sleep 8;
done
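
The retention step (list files newest first, skip the first X, delete the rest) can be factored into a reusable function. This is a sketch only; the name prune_oldest and its two arguments are assumptions introduced for illustration.

```shell
# Hypothetical helper: keep only the $2 newest files in directory $1.
prune_oldest() {
    dir=$1
    keep=$2
    # ls -t lists newest first; tail skips the first $keep names,
    # so only the older files reach the rm command.
    ls -t "$dir" | tail -n +$((keep + 1)) | while read -r file; do
        rm -f "$dir/$file"
    done
}

# Assumed usage inside the collection loop:
#   prune_oldest /var/run/log/netstats 500
```

Like the original loop, this relies on file names without newlines, which holds for the timestamped names the scripts generate.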

 

Script 3: To collect the packet captures

Instructions:

- Captures network traffic for IP ##.##.##.## using pktcap-uw with specific filters.
- Each packet capture file will be 500 MB in size.
- Keeps only the latest 20 .pcapng files in the capture directory.
- Stops the capture after about 7 days (2016 cycles of 300 seconds).
- Change the nic-xxxxx-ethx-vmware-sfw.2 to match the environment.
- Adjust the run time, destination directory, and other parameters as needed.

#!/bin/bash

mkdir -p /var/run/log/packetcapture/

# Start the capture in the background; -C 500 sets the size of each capture file in MB
nohup pktcap-uw --capture PreDVFilter,PostDVFilter --dvfilter nic-XXXXXXX-ethX-vmware-sfw.2 --ip ##.##.##.## --ng --snaplen 150 -C 500 -o /var/run/log/packetcapture/PreDVF_PostDVF.pcapng &

# Directory to monitor
directory="/var/run/log/packetcapture/"

# Number of newest files to keep
x=20

for i in $(seq 1 1 2016) ; do
 # List files newest first and delete everything after the first X
 ls -t "$directory" | tail -n +$((x+1)) | while read file; do
    rm -f "$directory/$file"
 done

 sleep 300
done

# Stop the background pktcap-uw process once the loop has finished
kill $(lsof |grep pktcap-uw |awk '{print $1}'| sort -u)
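
Before starting a long capture, it helps to estimate how much space the retained files can occupy. The sketch below multiplies the number of files kept by the per-file size; the function name max_capture_mb is hypothetical.

```shell
# Hypothetical helper: upper bound, in MB, on the space the retained
# capture files occupy right after a cleanup pass
# ($1 = files kept, $2 = MB per file).
max_capture_mb() {
    echo $(( $1 * $2 ))
}

max_capture_mb 20 500    # prints 10000 for the values used in Script 3
```

Because the cleanup runs only once every 300 seconds, more than 20 files can exist briefly on a busy link, so leave headroom beyond this figure.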




Execution Steps:

  1. Create the Scripts: Save each of the above scripts into separate .sh files on the ESXi host:

    • packetcapture.sh
    • getflows.sh
    • netstats.sh
  2. Run the Scripts: To execute the scripts, run the following commands on the ESXi host. These will run the scripts in the background using setsid.

     
    setsid sh packetcapture.sh &
    setsid sh getflows.sh &
    setsid sh netstats.sh &

    Example:

     
    [root@esx-04:~] setsid sh packetcapture.sh &
    nohup: appending output to nohup.out
    [1]+ Done setsid sh packetcapture.sh
     
    [root@esx-04:~] setsid sh getflows.sh &
    [1]+ Done setsid sh getflows.sh
     
    [root@esx-04:~] setsid sh netstats.sh &
    [1]+ Done setsid sh netstats.sh
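  3. Stop the Scripts Early (Upon Issue Occurrence): Once the latency/packet drop issue has occurred and been captured, stop the scripts so that no unnecessary data is collected. Note that killing packetcapture.sh does not stop the background pktcap-uw process that it started, so also run the pktcap-uw kill command from the last line of Script 3. Use the following commands to kill the running scripts: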

     
    kill $(ps -Tcjstv | grep packetcapture.sh | grep -v grep | awk '{print $1}')
    kill $(ps -Tcjstv | grep getflows.sh | grep -v grep | awk '{print $1}')
    kill $(ps -Tcjstv | grep netstats.sh | grep -v grep | awk '{print $1}')
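
The three kill commands share one pattern: filter the process listing by script name, drop the grep process itself, and take the first column. A sketch of that pattern as a function that reads the listing from stdin follows; the name pids_for is hypothetical.

```shell
# Hypothetical helper: read process-listing lines on stdin and print the
# first column (the ID the kill commands above use) for lines naming script $1.
pids_for() {
    grep -F -- "$1" | grep -v grep | awk '{print $1}' | sort -u
}

# Assumed usage on the ESXi host:
#   for s in packetcapture.sh getflows.sh netstats.sh; do
#       ids=$(ps -Tcjstv | pids_for "$s")
#       [ -n "$ids" ] && kill $ids
#   done
```

Printing the IDs first, before passing them to kill, makes it easy to confirm that only the intended scripts are matched.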


Important Note:

Ideally, collect the data within 1 hour of experiencing the issue. Beyond that timeframe the data may roll over and may no longer be useful for troubleshooting.
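
One way to protect collected data from rolling over is to pack a capture directory into a timestamped archive as soon as the scripts are stopped. The sketch below is illustrative only: the function name archive_capture is hypothetical, and the destination path in the usage comment is a placeholder to be replaced with a location that has enough free space.

```shell
# Hypothetical helper: pack capture directory $1 into a timestamped
# .tgz file under directory $2 and print the archive path.
archive_capture() {
    src=$1
    dest=$2
    stamp=$(date +%Y-%m-%d_%H-%M-%S)
    out="$dest/$(basename "$src")-$stamp.tgz"
    tar -czf "$out" -C "$(dirname "$src")" "$(basename "$src")" && echo "$out"
}

# Assumed usage (the datastore path is a placeholder):
#   archive_capture /var/run/log/getflows /vmfs/volumes/<datastore>/captures
```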