Periodic vs realtime NSX transport node statistics

Article ID: 395906

Products

VMware NSX

Issue/Introduction

Statistics for an entity (for example, a service interface) are often distributed across multiple Transport Nodes. In such cases, a specific statistic, such as the number of dropped packets, may appear higher or lower than expected. This article addresses one such scenario, using service interface statistics as a reference. The same logic applies to other entities as well.

Environment

  • VMware NSX
  • VMware NSX-T Data Center
  • VCF 9.x

Resolution

Example scenario

The problem scenario is as follows: a large number of dropped packets is observed on a Service Insertion (SI) interface, and the intent is to (a) understand where the drops occur and (b) investigate further if they are unexpected. This article focuses on the reporting framework and on (a). Once the drops are located, determining the reason for them may require further investigation.

Edge exporter service

The edge exporter service (also called the aggregation service or agg service) is responsible for gathering statistics for an entity from the various Transport Nodes (TNs) and reporting them to the Management Plane (MP). These statistics are aggregated and made available in the MP for consumption through the API or GUI.
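
As a rough illustration of the aggregation described above, the following Python sketch sums per-TN counters into a single cumulative view. The node names, counter names, and values are hypothetical, and the function only models the idea. It is not the exporter service's actual implementation or data format.

```python
from collections import Counter


def aggregate_stats(per_tn_stats):
    """Sum each counter across all Transport Nodes (hypothetical model)."""
    total = Counter()
    for counters in per_tn_stats.values():
        total.update(counters)  # Counter.update adds the values key by key
    return dict(total)


# Hypothetical per-TN snapshots for one interface.
per_tn_stats = {
    "edge-tn-1": {"rx_packets": 1000, "dropped_packets": 12},
    "edge-tn-2": {"rx_packets": 800, "dropped_packets": 0},
    "edge-tn-3": {"rx_packets": 1200, "dropped_packets": 45},
}

aggregated = aggregate_stats(per_tn_stats)
print(aggregated)
```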

Procedure for debugging issues

To debug issues such as the one described above (with the SI interface), check all the TNs for the specific entity. In this case, go to ‘Networking’ in the GUI and select ‘Tier-0 Gateways’ → ‘INTERFACES AND GRE TUNNELS’ → ‘External and Service Interfaces’ → select the specific interface → ‘Statistics’, then check the drop-down. This drop-down lists all the TNs associated with this interface (displayed as ‘ALL’).

Cached statistics vs on-demand stats

When no selection is made in the drop-down, the cumulative values of the statistics gathered periodically from the TNs are displayed. These are the cached statistics. When a specific TN is selected, a real-time query is made to that TN and its statistics are obtained on demand. These are the on-demand statistics for the interface as reported by that TN. If, for example, three TNs are listed for the entity and the on-demand statistics of all three are summed, the result is comparable to the cached total but not exactly the same. This is by design, and the values are not expected to converge over time.
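
The comparison described above can be sketched in Python as follows. The node names, the values, and the 5% tolerance are assumptions chosen for illustration. In practice, the cached total and the per-TN on-demand values would be read from the GUI or API.

```python
def compare_drops(cached_total, on_demand_by_tn, rel_tolerance=0.05):
    """Compare the cached cumulative drop count with the on-demand sum."""
    on_demand_sum = sum(on_demand_by_tn.values())
    diff = abs(on_demand_sum - cached_total)
    baseline = max(cached_total, on_demand_sum, 1)  # avoid division by zero
    return {
        "on_demand_sum": on_demand_sum,
        "difference": diff,
        # "Comparable" here means within the assumed relative tolerance.
        "comparable": diff / baseline <= rel_tolerance,
        # TN reporting the most drops: a starting point for investigation.
        "top_tn": max(on_demand_by_tn, key=on_demand_by_tn.get),
    }


# Hypothetical values: cached total from the default view,
# on-demand values from selecting each TN in turn.
cached_total = 57
on_demand_by_tn = {"edge-tn-1": 12, "edge-tn-2": 0, "edge-tn-3": 47}

result = compare_drops(cached_total, on_demand_by_tn)
print(result)
```

A small, non-zero difference between the two figures is the expected outcome. A plausible reason is that the cached total reflects the most recent periodic collection, whereas each on-demand value is read at query time.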