NSX-T "transport node down" alarms not showing in vRealize Operations

Article ID: 318820

Products

VCF Operations/Automation (formerly VMware Aria Suite) VMware NSX

Issue/Introduction

This article documents a known issue with the vRealize Operations and NSX-T integration and provides a resolution and a workaround.

Symptoms:

  • NSX-T "transport node down" alarms not showing in vRealize Operations
  • NSX-T adapter collections fail
  • Credentials test successfully for the NSX-T Adapter
  • Restarting collections for the NSX-T Adapter does not help
  • NSX-T logs show API calls taking an excessive amount of time, as observed in /var/log/proxy/localhost_access_log.txt:
2021-10-15T18:10:49.929Z ip.ip.ip.ip - "GET /api/v1/logical-router-ports/uuiduuid-uuid-uuid-uuid-uuiduuiduuid/arp-table?source=realtime&transport_node_id=uuiduuid-uuid-uuid-uuid-uuiduuiduuid HTTP/1.1" 400 331 10018 10017
2021-10-15T18:11:00.238Z ip.ip.ip.ip - "GET /api/v1/logical-router-ports/uuiduuid-uuid-uuid-uuid-uuiduuiduuid/arp-table?source=realtime&transport_node_id=uuiduuid-uuid-uuid-uuid-uuiduuiduuid HTTP/1.1" 200 332 6853 6853
2021-10-15T18:35:53.903Z ip.ip.ip.ip - "GET /api/v1/logical-router-ports/uuiduuid-uuid-uuid-uuid-uuiduuiduuid/arp-table?source=realtime&transport_node_id=uuiduuid-uuid-uuid-uuid-uuiduuiduuid HTTP/1.1" 200 326 6047 6046
  • NSX-T collector logs from the same time period show errors similar to:
2021-10-15T18:10:49,929+0000 ERROR [NSX-T Metric Collector thread 5, time: Fri Oct 15 16:56:45 UTC 2021] (3645) com.vmware.adapter3.nsxt.collector.metrics.LogicalRouterMetricCollector.getArpEntryCount - Error in fetching logical Router Port ARP table, exception caused is: com.vmware.vapi.std.errors.InvalidRequest: InvalidRequest (com.vmware.vapi.std.errors.invalid_request) => {
    messages = [],
    data = struct => {error_message=Failed in MP to fetch status/statistic of logical router port uuiduuid-uuid-uuid-uuid-uuiduuiduuid: com.vmware.nsx.management.messaging.model.RpcRequestError: com.vmware.nsx.management.messaging.exceptions.MessagingException: RPC request timed out.[requestId=uuiduuid-uuid-uuid-uuid-uuiduuiduuid]., httpStatus=BAD_REQUEST, error_code=6600, module_name=Nsx Aggregation Service L3},
    errorType = INVALID_REQUEST
}
2021-10-15T18:36:04,606+0000 ERROR [NSX-T Metric Collector thread 5, time: Fri Oct 15 18:26:47 UTC 2021] (3645) com.vmware.adapter3.nsxt.collector.metrics.LogicalRouterMetricCollector.getArpEntryCount - Error in fetching logical Router Port ARP table, exception caused is: com.vmware.vapi.std.errors.InvalidRequest: InvalidRequest (com.vmware.vapi.std.errors.invalid_request) => {
    messages = [],
    data = struct => {error_message=Failed in MP to fetch status/statistic of logical router port uuiduuid-uuid-uuid-uuid-uuiduuiduuid: null., httpStatus=BAD_REQUEST, error_code=6600, module_name=Nsx Aggregation Service L3},
    errorType = INVALID_REQUEST
}
2021-10-15T18:56:09,951+0000 ERROR [NSX-T Metric Collector thread 5, time: Fri Oct 15 18:26:47 UTC 2021] (3645) com.vmware.adapter3.nsxt.collector.metrics.LogicalRouterMetricCollector.getArpEntryCount - Error in fetching logical Router Port ARP table, exception caused is: com.vmware.vapi.std.errors.InvalidRequest: InvalidRequest (com.vmware.vapi.std.errors.invalid_request) => {
    messages = [],
    data = struct => {error_message=Failed in MP to fetch status/statistic of logical router port uuiduuid-uuid-uuid-uuid-uuiduuiduuid: null., httpStatus=BAD_REQUEST, error_code=6600, module_name=Nsx Aggregation Service L3},
    errorType = INVALID_REQUEST
}
  • vobd.log on some of the affected hosts contains errors such as the following:
cat vobd.log | grep exporter | less
2021-04-08T03:52:57.321Z: [UserWorldCorrelator] 1163239963140us: [vob.uw.core.dumped] /usr/lib64/vmware/nsx-exporter/nsx-exporter(2101932) /var/core/nsx-exporter-zdump.000
2021-04-08T03:52:57.321Z: [UserWorldCorrelator] 1163240303853us: [esx.problem.application.core.dumped] An application (/usr/lib64/vmware/nsx-exporter/nsx-exporter) running on ESXi host has crashed (1 time(s) so far). A core file may have been created at /var/core/nsx-exporter-zdump.000.
2021-04-13T04:29:07.886Z: [UserWorldCorrelator] 289986725738us: [vob.uw.core.dumped] /usr/lib64/vmware/nsx-exporter/nsx-exporter(2102385) /var/core/nsx-exporter-zdump.001
2021-04-13T04:29:07.886Z: [UserWorldCorrelator] 289986818907us: [esx.problem.application.core.dumped] An application (/usr/lib64/vmware/nsx-exporter/nsx-exporter) running on ESXi host has crashed (1 time(s) so far). A core file may have been created at /var/core/nsx-exporter-zdump.001.
2021-04-13T05:09:33.762Z: [UserWorldCorrelator] 292412601940us: [vob.uw.core.dumped] /usr/lib64/vmware/nsx-exporter/nsx-exporter(4402615) /var/core/nsx-exporter-zdump.002
2021-04-13T05:39:46.751Z: [UserWorldCorrelator] 294225593683us: [vob.uw.core.dumped] /usr/lib64/vmware/nsx-exporter/nsx-exporter(4421671) /var/core/nsx-exporter-zdump.003
  • The affected hosts also have nsx-exporter-zdump files in /var/core/
  • esxcli software vib list shows an nsx-exporter version earlier than 3.1.2.1, 3.1.3.0, or 3.2.0.0.
Example: nsx-exporter 3.1.2.0.0-7.0.17883598 VMware VMwareCertified 2021-06-21
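
The slow ARP-table calls in the access log excerpt above can be picked out with a short script. The following is a minimal sketch, not a supported tool. It assumes that the third numeric field after the request string is the request duration in milliseconds, which this article does not state explicitly. The function name slow_requests and the 5000 ms threshold are hypothetical choices for illustration.

```python
import re

# Layout taken from the excerpt above:
# <timestamp> <client> - "<method> <path> <protocol>" <status> <bytes> <n1> <n2>
# ASSUMPTION: <n1> is the request duration in milliseconds.
ACCESS_LINE = re.compile(
    r'^(?P<ts>\S+) (?P<client>\S+) - "(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+) (?P<elapsed>\d+) \d+\s*$'
)


def slow_requests(log_text, threshold_ms=5000):
    """Return (timestamp, status, elapsed) for requests at or above threshold_ms.

    threshold_ms is a hypothetical cut-off, not a value from this article.
    """
    slow = []
    for line in log_text.splitlines():
        match = ACCESS_LINE.match(line.strip())
        if match and int(match.group("elapsed")) >= threshold_ms:
            slow.append(
                (match.group("ts"), int(match.group("status")), int(match.group("elapsed")))
            )
    return slow
```

To use it, pass the contents of /var/log/proxy/localhost_access_log.txt as a string. Lines that do not match the layout are skipped rather than reported as errors.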


This can happen on any ESXi host that has nsx-exporter installed and carries traffic that triggers the issue. To identify potentially affected hosts, check /var/core/ for nsx-exporter-zdump files.
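
When vobd.log has been copied off a host, the nsx-exporter core dump events can also be extracted from it programmatically. This is a minimal sketch that assumes the log lines follow the same layout as the vobd.log excerpt in the symptoms above. The function name exporter_core_dumps is hypothetical.

```python
import re

# Matches the "[vob.uw.core.dumped]" lines shown in the vobd.log excerpt, e.g.
# <timestamp>: [UserWorldCorrelator] <n>us: [vob.uw.core.dumped] <binary>(<pid>) <dump file>
CORE_DUMP = re.compile(
    r"^(?P<ts>\S+): \[UserWorldCorrelator\] \d+us: \[vob\.uw\.core\.dumped\] "
    r"(?P<binary>\S*nsx-exporter)\(\d+\) (?P<dump>\S+)"
)


def exporter_core_dumps(log_text):
    """Return (timestamp, dump_path) for each nsx-exporter core dump event."""
    dumps = []
    for line in log_text.splitlines():
        match = CORE_DUMP.match(line.strip())
        if match:
            dumps.append((match.group("ts"), match.group("dump")))
    return dumps
```

An empty result does not prove a host is unaffected, since vobd.log may have rotated. The presence of nsx-exporter-zdump files in /var/core/ remains the check described in this article.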

Environment

VMware vRealize Operations 8.x
VMware NSX-T Data Center
VMware NSX-T Data Center 3.x
VMware NSX-T

Cause

This is caused by a known issue that occurs when DFW IPFIX is enabled together with L7 rules on NSX-T versions earlier than 3.1.2.1, 3.1.3.0, or 3.2.0.0.

Resolution

To resolve this issue, upgrade to NSX-T 3.1.2.1, 3.1.3.0, 3.2.0.0, or later.
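
To confirm whether a host still runs an affected nsx-exporter build, the version reported by esxcli software vib list can be compared against the lowest fixed release named here. This is a minimal sketch under two assumptions: that the first four dotted fields of the VIB version correspond to the NSX-T release, and that every release at or above 3.1.2.1 contains the fix. The names FIXED, exporter_version, and is_affected are hypothetical.

```python
import re

# Lowest fixed release named in this article (3.1.2.1).
# ASSUMPTION: any release at or above this one contains the fix.
FIXED = (3, 1, 2, 1)


def exporter_version(vib_line):
    """Return the first four numeric version fields of an nsx-exporter VIB line.

    Returns None when the line is not an nsx-exporter entry.
    ASSUMPTION: these four fields map to the NSX-T release number.
    """
    match = re.match(r"\s*nsx-exporter\s+(\d+)\.(\d+)\.(\d+)\.(\d+)", vib_line)
    return tuple(int(field) for field in match.groups()) if match else None


def is_affected(vib_line):
    """Return True when the line shows an nsx-exporter older than FIXED."""
    version = exporter_version(vib_line)
    return version is not None and version < FIXED
```

For the example line in the symptoms, nsx-exporter 3.1.2.0.0-7.0.17883598, exporter_version returns (3, 1, 2, 0) and is_affected returns True.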

Workaround:
If upgrading is not currently an option, disable DFW IPFIX from Plan & Troubleshoot > IPFIX > Firewall IPFIX Profiles.