Missing CPU and Memory metrics from DNAC Wi-Fi (AP) device in PM


Article ID: 416282


Products

Network Observability, Virtual Network Assurance, CA Performance Management

Issue/Introduction

CPU and memory metrics have been missing in PM for the past two months for DNAC Wi-Fi (AP) devices. On closer inspection, the same is happening for other DNAC devices such as switches.

 

Enable TRACE logging for SDNPerformance on the Data Collector (DC) aligned to the VNA Gateway that supplies this data.

Edit the /opt/IMDataCollector/apache-karaf/etc/org.ops4j.pax.logging.cfg file.

Locate this row/parameter: 

log4j2.logger.SDNPerformance.level = INFO

Change to: 

log4j2.logger.SDNPerformance.level = TRACE

No need to restart any DC service.
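The log-level change above can also be scripted. The following is a minimal sketch, assuming the parameter appears exactly once as an uncommented `key = value` line; the function names are hypothetical and not part of any DC tooling. The same function can set the level back to INFO once troubleshooting is done.

```python
import re
from pathlib import Path

# Path and key are taken from the article; adjust if the DC is installed elsewhere.
LOGGING_CFG = Path("/opt/IMDataCollector/apache-karaf/etc/org.ops4j.pax.logging.cfg")
LOGGER_KEY = "log4j2.logger.SDNPerformance.level"


def set_logger_level(cfg_text: str, level: str, key: str = LOGGER_KEY) -> str:
    """Return cfg_text with the value of `key` replaced by `level`.

    Raises ValueError when the key is absent, so a typo never silently
    leaves the configuration unchanged.
    """
    pattern = re.compile(r"^([ \t]*" + re.escape(key) + r"[ \t]*=[ \t]*).*$", re.MULTILINE)
    new_text, count = pattern.subn(lambda m: m.group(1) + level, cfg_text)
    if count == 0:
        raise ValueError(f"{key} not found in configuration")
    return new_text


def update_cfg_file(path: Path, level: str) -> None:
    """Rewrite the file in place, keeping a .bak copy of the original (hypothetical helper)."""
    original = path.read_text()
    path.with_name(path.name + ".bak").write_text(original)
    path.write_text(set_logger_level(original, level))
```

For example, `update_cfg_file(LOGGING_CFG, "TRACE")` would enable tracing and `update_cfg_file(LOGGING_CFG, "INFO")` would revert it.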

Watch the /opt/IMDataCollector/apache-karaf/data/log/vna.log file for the SdnID of the affected Wi-Fi device. To get the SdnID, query rest/sdns/<device itemid>:

https://<DA_host>:8582/rest/sdns/<itemid>

<SdnID>XXX</SdnID>

Check whether CPU or MEMORY data is arriving for that SdnID.
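As a rough illustration of this lookup, the sketch below pulls the SdnID out of a saved REST response and filters vna.log lines that mention it. It assumes the response is XML containing a single `<SdnID>` element, as shown above; the overall response layout and both function names are assumptions, not a documented API.

```python
import re
import xml.etree.ElementTree as ET


def extract_sdn_id(xml_text: str):
    """Return the text of the first <SdnID> element, or None if there is none."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # Ignore any XML namespace prefix such as "{uri}SdnID".
        if elem.tag.rsplit("}", 1)[-1] == "SdnID":
            return (elem.text or "").strip() or None
    return None


def lines_mentioning(log_lines, sdn_id: str):
    """Yield log lines containing sdn_id as a whole token, not as part of a longer ID."""
    token = re.compile(r"(?<![\w/-])" + re.escape(sdn_id) + r"(?![\w/-])")
    for line in log_lines:
        if token.search(line):
            yield line
```

With the response saved to a file, the matching lines can then be searched for CPU or memory metric names while TRACE is enabled.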

If no such data appears, the problem is on the VNA side. If data does appear, check whether it is being loaded into the rate table for the SDN Device metric family.

Afterwards, remember to disable TRACE on the DC by setting the level back to INFO.

Check the sdn_devices_rate table in Vertica (DR).

select to_timestamp(tstamp), rinterval, duration, im_memoryutilization from sdn_devices_rate where item_id = <itemid> order by tstamp desc limit 100;  

select to_timestamp(tstamp), max(rinterval), max(duration), avg(im_memoryutilization), count(tstamp) from sdn_devices_rate where item_id = <itemid> group by tstamp order by tstamp desc limit 100;

For the non-working device, the VNA Gateway is sending only the AVAILABILITY metric family (MF):

Line 245617: TRACE | ocessor-thread-1 | 2025-09-30T11:10:18,824 | SDNPerformance | rmance.impl.SdnPerfDataProcessor  181 | createSdnResponse: | Processing 7748316/60 with metric = state, value = 1.0, pollGroupId = 9788, Normalized SampleTime Tue Sep 30 11:10:18 CEST 2025 vs. SampleTime Tue Sep 30 11:10:18 CEST 2025
Line 245618: TRACE | ocessor-thread-1 | 2025-09-30T11:10:18,824 | SDNPerformance | rmance.impl.SdnPerfDataProcessor  181 | createSdnResponse: | Processing 7748316/60 with metric = system_uptime, value = 1.7585744E+10, pollGroupId = 9788, Normalized SampleTime Tue Sep 30 11:10:18 CEST 2025 vs. SampleTime Tue Sep 30 11:10:18 CEST 2025
Line 245619: TRACE | ocessor-thread-1 | 2025-09-30T11:10:18,824 | SDNPerformance | rmance.impl.SdnPerfDataProcessor  140 | processDataPoints: | SDN item ID 7748316/60 of MF AVAILABILITY has 2 metrics at 1759223418000 with EOC time 1759223400000 and duration 599000

 

It looks like the VNA Gateway is not sending the CPU_AND_MEMORY MF for SDN ID 7748316/60.
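The same check can be run over a whole vna.log capture. This is a minimal sketch, assuming the `processDataPoints` TRACE lines always follow the "SDN item ID ... of MF ... has N metrics" wording seen in the excerpt; the function names are hypothetical.

```python
import re
from collections import defaultdict

# Matches the processDataPoints summary line from the TRACE excerpt.
MF_LINE = re.compile(r"SDN item ID (\S+) of MF (\S+) has (\d+) metrics")


def metric_families_by_item(log_lines):
    """Map each SDN item ID to the set of metric families seen in the log."""
    families = defaultdict(set)
    for line in log_lines:
        match = MF_LINE.search(line)
        if match:
            families[match.group(1)].add(match.group(2))
    return dict(families)


def is_family_missing(log_lines, sdn_item: str, family: str = "CPU_AND_MEMORY") -> bool:
    """True when no line reports `family` for `sdn_item`."""
    return family not in metric_families_by_item(log_lines).get(sdn_item, set())
```

An item that only ever reports AVAILABILITY, like the one above, would be flagged as missing CPU_AND_MEMORY.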

We will need to investigate the issue on the VNA Gateway side.

a) Get the <VNA_home>/collector/<DNAC_Engine_UUID>/repository/work/FilterConfiguration.json file.

b) Set the OC_ACQUISITION, DNAC_PLUGIN, and PERFORMANCE_SERVICE loggers to DEBUG and the SOUTHBOUND_UPDATES logger to TRACE:

cd /opt/CA/VNA/wildfly/bin
./jboss-cli.sh --connect
/subsystem=logging/logger=OC_ACQUISITION:write-attribute(name=level,value=DEBUG)
/subsystem=logging/logger=SOUTHBOUND_UPDATES:write-attribute(name=level,value=TRACE)
/subsystem=logging/logger=DNAC_PLUGIN:write-attribute(name=level,value=DEBUG)
/subsystem=logging/logger=PERFORMANCE_SERVICE:write-attribute(name=level,value=DEBUG)

Collect the following information:

a) VNA MySQL database backup file.

https://techdocs.broadcom.com/us/en/ca-enterprise-software/it-operations-management/virtual-network-assurance/24-3/migrate.html#concept.dita_8bb75cf0ca0aacdc4a5dc1b58b4cb8aa94fd8b26_backupvnadb

b) All files from /opt/CA/VNA/wildfly/standalone/log/ directory.

c) All files from the /opt/CA/VNA/collector/DNAC_xxxxxx/ directory, collected while OC_ACQUISITION debug logging is enabled.

Environment

DX NetOps PM (Performance Management) and VNA (Virtual Network Assurance) 24.3.6

Cause

The API response from the DNAC controller returns the deviceFamily field as null. This field is mandatory for mapping performance metrics. Because the very first record lacks a device family, the entire dataset is ignored.

Prior to August, the DNAC responses included valid deviceFamily values. However, starting from August, a few responses began returning null for this field, which led to the data being discarded.

 

The DNACDevicesHealthParser_GroovyFunction_EE-Managed_ThreadFactory-default-Thread-9_########.json file contains records like this one, with deviceFamily set to null:

  {
    "name": "<Device Name>",
    "ipAddress": "<IP Address>",
    "macAddress": null,
    "overallHealth": -3,
    "issueCount": 1,
    "interfaceLinkErrHealth": -1,
    "cpuUlitilization": null,
    "cpuHealth": -1,
    "memoryUtilizationHealth": -1,
    "memoryUtilization": null,
    "interDeviceLinkAvailHealth": 0,
    "interDeviceLinkAvailFabric": -1,
    "freeTimerScore": -1,
    "packetPoolHealth": -1,
    "wqePoolsHealth": -1,
    "wanLinkUtilization": -1,
    "clientCount": {},
    "interferenceHealth": {},
    "noiseHealth": {},
    "airQualityHealth": {},
    "utilizationHealth": {},
    "deviceFamily": null
  },
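To see how many records are affected, the dump can be scanned for entries whose deviceFamily is null. The sketch below assumes the file is valid JSON. Because the top-level layout of the dump is not shown here, it walks the whole structure rather than assuming a fixed shape; the function name is hypothetical.

```python
import json


def find_null_family(node, found=None):
    """Recursively collect dicts whose "deviceFamily" key is present but null."""
    if found is None:
        found = []
    if isinstance(node, dict):
        if "deviceFamily" in node and node["deviceFamily"] is None:
            found.append(node)
        for value in node.values():
            find_null_family(value, found)
    elif isinstance(node, list):
        for item in node:
            find_null_family(item, found)
    return found


def names_with_null_family(json_text: str):
    """Return the "name" of every record that lacks a device family."""
    return [rec.get("name") for rec in find_null_family(json.loads(json_text))]
```

Calling `names_with_null_family(open(dump_path).read())`, where `dump_path` points at the dump file, would list the device names to raise with the DNAC administrators.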

 

Resolution

Workaround:

1. Edit the file:
/opt/CA/VNA/collector/DNAC_###################/repository/TIM-INF/groovy/scripts/DNACDevicesHealthParser.groovy

2. Replace line 61 with the code snippet below. For reference, line 61 currently reads "devicesMetricsList.add(deviceMetrics)".

if (it.deviceFamily != null) {
   devicesMetricsList.add(deviceMetrics)
}

3. No service restart is required after updating the Groovy script. Wait 3–4 polling cycles (about 30 minutes), then check the portal for device metrics.

Additional Information

We’ll productize this fix as part of the 25.4.2 release.